von Wittgenstein et al. BMC Evolutionary Biology 2014, 14:11
http://www.biomedcentral.com/1471-2148/14/11



Evolutionary Biology 



RESEARCH ARTICLE Open Access 



Evolutionary classification of ammonium, nitrate, 
and peptide transporters in land plants 

Neil JJB von Wittgenstein, Cuong H Le, Barbara J Hawkins and Jürgen Ehlting



Abstract 

Background: Nitrogen uptake and reallocation, both within the plant and between subcellular compartments, involve
ammonium, nitrate and peptide transporters. Ammonium transporters are separated into two distinct families
(AMT1 and AMT2), each comprising five members on average in angiosperms. Nitrate transporters also form two
discrete families (NRT1 and NRT2), with angiosperms having four NRT2s, on average. NRT1s share an evolutionary
history with peptide transporters (PTRs). The NRT1/PTR family in land plants usually has more than 50 members and
also contains members with distinct activities, such as glucosinolate and abscisic acid transport.

Results: Phylogenetic reconstructions of each family across 20 land plant species with available genome sequences
were supplemented with subcellular localization and transmembrane topology predictions. This revealed that both
AMT families diverged prior to the separation of bryophytes and vascular plants, each forming two distinct clans,
designated as supergroups. Ten supergroups were identified for the NRT1/PTR family. It is apparent that
nitrate and peptide transport within the NRT1/PTR family is polyphyletic, that is, nitrate and/or peptide transport
likely evolved multiple times within land plants. The NRT2 family separated into two distinct clans early in vascular
plant evolution. Subsequent duplications occurring prior to the eudicot/monocot separation led to the existence of
two AMT1, six AMT2, 31 NRT1/PTR, and two NRT2 clans, designated as groups.

Conclusion: Phylogenetic separation of groups suggests functional divergence within the angiosperms for each 
family. Distinct groups within the NRT1/PTR family appear to separate peptide and nitrate transport activities as well
as other activities contained within the family, for example nitrite transport. Conversely, distinct activities, such as 
abscisic acid and glucosinolate transport, appear to have recently evolved from nitrate transporters. 

Keywords: Ammonium transporter (AMT1 and AMT2), Nitrate transporter (NRT1 and NRT2), Peptide transporter
(PTR), Gene family evolution 



Background 

Nitrogen (N) is the macronutrient required by plants in
the greatest amounts, yet N is often the most limiting
nutrient in terrestrial ecosystems, due to its low availability
in soils [1,2]. In soils, N can exist as organic N, in
the forms of amino acids, free peptides, and proteins
[3,4], and as inorganic N, in the forms of nitrate (NO₃⁻)
and ammonium (NH₄⁺) [4]. Inorganic N is the most
prominent form of N taken up by many land plant species
[5,6]. NH₄⁺ and NO₃⁻ uptake from the soil, as well
as movement of NH₄⁺ and NO₃⁻ throughout the plant,
is regulated by current N demand for growth and
storage, and is largely performed by two groups of ion
transporter proteins, NH₄⁺ transporters (AMTs) and
NO₃⁻ transporters (NRTs) [4,7]. Each group can be subdivided
into two families based on sequence similarity:
NRT1 and NRT2, and AMT1 and AMT2. NRT1s are
part of a large family of solute transporters that also includes
peptide transporters (PTR).

NRTs are encoded by two distinct gene families (NRT1
and NRT2) that do not share significant overall sequence
similarity. Both families perform proton-coupled active
transport and have 12 putative transmembrane (TM) domains
[5]. The NRT2 family is responsible for the high
affinity transport system (HATS) of NO₃⁻ [8]. The HATS
is composed of saturable transporters that take up NO₃⁻
at low rates and high affinity and are expressed under NO₃⁻
limiting conditions. The HATS has inducible members
(iHATS), which are expressed in response to low NO₃⁻ concentrations,
as well as constitutive members (cHATS),
which are not N-inducible [9]. Some members of the
NRT2 family require physical association (protein interaction)
with NAR2 (Nitrate Assimilation-Related) proteins
[10,11] for proper functioning. Interaction with NAR2 proteins
was shown to be necessary in diverse plant lineages,
including monocots, eudicots, and green algae [9,12].

NRT1s are responsible for the low affinity transport system
(LATS) of NO₃⁻. The LATS contains non-saturable
transporters that transport NO₃⁻ at much higher rates than
the HATS and are expressed under NO₃⁻ abundant conditions
[8]. More than fifty putative members of the family
have been identified in A. thaliana; however, many of these
are not NO₃⁻ transporters but more likely encode transporters
of other N-containing compounds such as small
peptides or amino acids [8]. Recently, NRT1/PTR family
members have also been shown to transport solutes with
distinct physiological functions, such as the plant hormone
abscisic acid (ABA) [13] or herbivore-deterring glucosinolates
[14].

Both AMT1s and AMT2s contain 11 putative TM domains
[6,15,16]. The AMT1 family largely comprises
members responsible for high affinity NH₄⁺ transport
[17]. AMT1s are channel-like proteins [18] that act as
NH₄⁺ uniporters or NH₃/H⁺ cotransporters [19]. AMT1s
and AMT2s do share a distant common evolutionary
history and the superfamily includes the Rh family of
ammonium transporters present in green algae, but not
in land plants. AMT1s are more closely related to
prokaryotic ammonium transporters than they are to
AMT2s and were likely inherited vertically [16]. In contrast,
plant AMT2s (referred to as MEPa in [16]) form a
sister clan to some fungal proteins from Leotiomyceta
and several horizontal gene transfers are apparent in the
larger MEP family [16]. In general, the physiological
roles of AMT2 proteins are less well understood than
those of AMT1 proteins [20,21]. The Lotus japonicus
LjAMT2-2 is involved in NH₃ uptake through mycorrhizal
symbiosis [22]. AMT2s do not exist in most green
algae, but they are present in Mamiellales, although
these AMT2s do not share a common evolutionary
origin with land plant AMT2s. McDonald et al. [16]
suggested that land plant AMT2s share a common
origin with AMT2s from Archaea, while a separate horizontal
gene transfer event from bacteria may have been
responsible for the AMT2s in Mamiellales.

Here, we present comprehensive phylogenies reconstructing
the evolutionary history of the NH₄⁺, NO₃⁻
and peptide transporter families, AMT1, AMT2,
NRT1/PTR, and NRT2 across 20 fully sequenced land
plant (Embryophyta) genomes complemented with two
green algal (Chlorophyta) species. These phylogenies are
supplemented with TM domain topology predictions,
subcellular localization predictions, and in silico expression
profiling for selected species. All four N transporter families
appear to be monophyletic in plants. However, all four families
in angiosperms contain members that separated early
during land plant evolution and that further diverged
through gene duplications prior to the monocot/eudicot
split to give rise to evolutionarily and functionally distinct
groups. This provides the basis to build hypotheses on
physiological functions of NH₄⁺, NO₃⁻, and peptide transporters,
and suggests a classification system for the transporter
families based on their evolutionary relationships.

Results and discussion 

Functionally characterized NRTs and AMTs [6,21,23-30]
were used for BLASTP searches against the annotated
proteomes derived from 20 land plant genome sequences,
and this set was complemented with two green
algal species (both belonging to the Chlorophyceae),
resulting in a total of more than 1,300 plant protein
sequences analyzed (Table 1, Additional file 1). Sequences
not named beforehand were given letters (e.g.
PtNRT2-A, MgAMT1-B, etc.) and sequences that had
been named or functionally characterized to some degree
retained the original name assigned. The AMT1,
AMT2, and NRT2 transporter classes are encoded by
comparatively small gene families in most plants, ranging
from one to 14 members. In contrast, the NRT1/PTR
family can have more than 90 members (Table 1). Two
Chlorophyceae genomes, from Chlamydomonas reinhardtii
and Volvox carteri, were included as a root to
the land plants. They contain NRT2 and AMT1 family
members, but not AMT2s. A single NRT1/PTR-like sequence
is present in V. carteri but not in C. reinhardtii
(Table 1). When present, green algal and land plant sequences
each form sister clades in rooted plant-only
maximum likelihood phylogenetic reconstructions (part
A of Figures 1, 2, 3, and 4), suggesting that a single
NRT2 and a single AMT1 gene were present in the ancestor of
Viridiplantae. To evaluate if all land plant sequences
were indeed inherited vertically, we used representative
members from all major clades in sequence similarity
searches against GenBank excluding land plant and
green algal species. If sequences with close homology to
specific land plant sub-clades existed outside the
plant lineage (e.g. because they were transmitted horizontally)
and were present in GenBank, similarity searches
(such as BLAST) should identify them more readily than
more distantly related vertically inherited sequences and
they should be among the most similar hits. Inclusion of
these non-plant sequences into the phylogeny should
then place horizontally transmitted sequences within the
plant clan, while vertically related sequences should
form a distinct clan outside the whole plant family in
unrooted phylogenies. In all four cases, all non-plant

Table 1 Members from the AMT1, AMT2, NRT1/PTR, and NRT2 gene families analyzed

                                                    Number of members analyzed(a)
Taxonomic group   Species (abbreviation)            AMT1   AMT2   NRT1/PTR   NRT2
Eudicot           Aquilegia coerulea (Ac)              1      4         48      2
                  Arabidopsis lyrata (Al)              6      1         49      6
                  Arabidopsis thaliana (At)            5      1         51      6
                  Carica papaya (Cp)                   2      1         41      2
                  Cucumis sativus (Cs)                 4      2         49      1
                  Glycine max (Gm)                     5      5         96      3
                  Manihot esculenta (Me)               5      4         61      3
                  Medicago truncatula (Mt)             4      3         52      1
                  Mimulus guttatus (Mg)                6      2         52      7
                  Populus trichocarpa (Pt)             6      5         70      6
                  Prunus persica (Prp)                 3      4         49      2
                  Ricinus communis (Rc)                4      3         41      4
                  Vitis vinifera (Vv)                  1      1         44      0
Monocot           Brachypodium distachyon (Bd)         2      6         67      5
                  Oryza sativa (Os)                    2      6         65      3
                  Setaria italica (Si)                 2      6         74      7
                  Sorghum bicolor (Sb)                 2      6         67      4
                  Zea mays (Zm)                        3      5         51      3
Lycophyte         Selaginella moellendorffii (Sm)      1      0         31      2
Bryophyte         Physcomitrella patens (Pp)           5     10         18      8
Green algae       Chlamydomonas reinhardtii (Cr)       3      0          0      3
                  Volvox carteri (Vc)                  6      0          1      3

Given are the numbers of protein sequences included(a) from each of the 22 fully sequenced Viridiplantae genomes studied. A full list of sequences and respective
accession numbers are given in Additional file 1.

(a) Actual family sizes may be larger, because partial sequences and those leading to distortions in the phylogenies were excluded here.



sequences formed a single clan (part B of Figures 1, 2, 3, 
and 4) suggesting that indeed all plant sequences are 
monophyletic and were inherited vertically. 
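
The hit-selection step behind this test can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the row layout (query, subject, taxon, expect value), the function name `top_hits`, and the parameter `n` are assumptions introduced here, and the toy hits are invented.

```python
from collections import defaultdict


def top_hits(rows, n=10, excluded_taxa=frozenset()):
    """Keep the n lowest-expect-value hits per query, skipping excluded taxa.

    rows: iterable of (query_id, subject_id, subject_taxon, evalue) tuples
    (a hypothetical layout; real BLAST tabular output has more columns).
    """
    by_query = defaultdict(list)
    for query, subject, taxon, evalue in rows:
        if taxon in excluded_taxa:
            continue  # e.g. drop land plant and green algal subjects
        by_query[query].append((evalue, subject))
    # Sort by expect value (ties broken by subject id) and truncate to n.
    return {q: [s for _, s in sorted(hits)[:n]] for q, hits in by_query.items()}


# Invented toy hits for one query.
toy_rows = [
    ("q1", "animal_a", "Metazoa", 1e-50),
    ("q1", "plant_b", "Viridiplantae", 1e-90),
    ("q1", "fungus_c", "Fungi", 1e-60),
    ("q1", "amoeba_d", "Amoebozoa", 1e-10),
]
best = top_hits(toy_rows, n=2, excluded_taxa={"Viridiplantae"})
print(best)
```

The retained subjects would then be aligned with the plant sequences and included in the unrooted phylogeny.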

Within the land plants, several gene birth and death
events apparently occurred throughout the lineage, giving
rise to a complex mixture of subfamilies. Clades were initially
characterized as chlorophyte, bryophyte, lycophyte,
or angiosperm. Within the angiosperms, 'groups' were defined
where there was a single common ancestor between
a eudicot and a monocot clade. Groups were combined into
supergroups, with separate supergroups defined where clades had
diverged prior to the bryophyte/vascular plant split, i.e. were
separated by P. patens sequences.

The NRT1/PTR family 

The NRT1/PTR family is named after the functions
of its founding members, namely nitrate transporters
(NRT1) and peptide transporters (PTR). The gene family
is comprised of 54 family members on average in land
plants, ranging from 18 copies in the moss Physcomitrella
patens to 96 copies in Glycine max (Table 1).
Phylogenetic reconstructions covering a total of 1,077
plant and 24 non-plant sequences separated the family
into ten supergroups (Figure 1), which were further separated
into a total of 32 groups (Additional file 2). Beyond
NO₃⁻ and peptide transport, the functions in this
family cover a wide range including glucosinolate transport
[14], abscisic acid transport [13], and NO₃⁻ excretion
(AtNAXT1) [31]. To test whether these diverse
plant sequences likely form a monophyletic group in
plants, representative sequences from multiple groups
within each supergroup were used in BLASTP searches
against the GenBank database, excluding Viridiplantae
sequences, and the ten best hits each (lowest expect
value) were retained. This resulted in a total of 24 proteins
from diverse eukaryotes including animals, amoeba, and
heterokonts (Additional file 1) sharing up to 32% overall sequence
identity with plant sequences. Many of the animal
sequences were annotated as SoLute Carrier 15 (SLC15)
family proteins, and reciprocal BLAST searches using the
human SLC15A1 protein, which is a peptide transporter
[32], against the Phytozome database revealed
NRT1/PTR family members as best hits. We therefore

[Figure 1: tree image not reproduced. Legible labels: supergroup D (NRT1, GTR, NAXT); supergroup A (AIT, NRT1); supergroup H; PTR.]

Figure 1 NRT1/PTR phylogeny. Unrooted maximum likelihood phylogenetic reconstruction of the NRT1/PTR families in plants and a set of 24
non-plant sequences identified as best GenBank BLAST hits using representative members from each supergroup as query. Taxonomic groups are
colored such that blue refers to eudicots, red to monocots, green to chlorophytes, yellow to bryophytes, and orange to lycophytes. Percent
bootstrap values from 1,000 replicates are given for central branches only up to the branches defining supergroups. The approximate location of
functionally characterized transporters discussed is indicated (NRT: nitrate transporter, GTR: glucosinolate transporter, PTR: peptide transporter,
NiTR: nitrite (NO₂⁻) transporter, NAXT: nitrate excretion transporter, and AIT: abscisic acid (ABA) transporter). *Note that supergroup F is paraphyletic; the
constituent clades have been combined owing to poor bootstrap support separating them. For detailed phylogenies of each superfamily and
group definitions see Additional file 2. For database accession numbers of the 1,101 protein sequences included see Additional file 1.



opted to name uncharacterized proteins as SLC15 followed 
by a gene identification letter (Additional file 1).
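
The family-size figures quoted in this section can be checked directly against Table 1. The short Python sketch below copies the NRT1/PTR column of Table 1 into a dictionary (the variable names are ours, not the authors') and recomputes the land plant average, the extremes, and the total number of plant sequences.

```python
from statistics import mean

# NRT1/PTR sequences analyzed per land plant species, copied from Table 1.
nrt1_ptr_land_plants = {
    "Ac": 48, "Al": 49, "At": 51, "Cp": 41, "Cs": 49, "Gm": 96, "Me": 61,
    "Mt": 52, "Mg": 52, "Pt": 70, "Prp": 49, "Rc": 41, "Vv": 44,
    "Bd": 67, "Os": 65, "Si": 74, "Sb": 67, "Zm": 51, "Sm": 31, "Pp": 18,
}
# The two green algae (Table 1): none in C. reinhardtii, one in V. carteri.
nrt1_ptr_green_algae = {"Cr": 0, "Vc": 1}

land_mean = mean(nrt1_ptr_land_plants.values())
smallest = min(nrt1_ptr_land_plants, key=nrt1_ptr_land_plants.get)
largest = max(nrt1_ptr_land_plants, key=nrt1_ptr_land_plants.get)
total_plant = sum(nrt1_ptr_land_plants.values()) + sum(nrt1_ptr_green_algae.values())

print(f"mean per land plant species: {land_mean:.1f}")
print(f"smallest family: {smallest}, largest family: {largest}")
print(f"plant sequences in total: {total_plant}")
```

With the Table 1 values, the mean rounds to the 54 members quoted above, the extremes are P. patens and G. max, and the total matches the 1,077 plant sequences included in the phylogeny.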

Inclusion of the non-plant sequences into the phylogeny
placed none of the non-plant sequences within a
plant clan and all non-plant sequences form a single clan
in the unrooted phylogeny (Figure 1B). This suggests
that, despite the variety of functions, the NRT1/PTR
family is monophyletic in plants.
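
The criterion used here, that all non-plant sequences form a single clan in an unrooted tree, amounts to asking whether one branch of the tree separates exactly those leaves from all others. A minimal Python sketch of that check follows; the nested-tuple tree encoding, the function names, and the toy leaf names are illustrative assumptions rather than anything taken from the study.

```python
def collect_leaf_sets(node, out):
    """Append the leaf set below every node of a nested-tuple tree to out."""
    if isinstance(node, str):
        leaves = frozenset([node])
    else:
        leaves = frozenset().union(*(collect_leaf_sets(c, out) for c in node))
    out.append(leaves)
    return leaves


def forms_single_clan(tree, members):
    """True if some branch splits the leaves into members versus the rest.

    The tree is written with an arbitrary root; because the check accepts
    either side of each branch, the answer does not depend on that root.
    """
    leaf_sets = []
    all_leaves = collect_leaf_sets(tree, leaf_sets)
    members = frozenset(members)
    return any(s == members or all_leaves - s == members for s in leaf_sets)


non_plant = {"animal_1", "animal_2", "amoeba_1"}

# Hypothetical vertical-inheritance pattern: non-plant leaves sit together.
vertical = ((("plant_a", "plant_b"), "alga_v"), (("animal_1", "animal_2"), "amoeba_1"))
# Hypothetical transfer pattern: one non-plant leaf nests among plant leaves.
transferred = ((("plant_a", "animal_1"), "plant_b"), ("animal_2", "amoeba_1"))

print(forms_single_clan(vertical, non_plant))
print(forms_single_clan(transferred, non_plant))
```

In the first toy tree the non-plant leaves are separated from the plant leaves by a single branch; in the second they are not, which is the pattern a horizontally transmitted sequence would be expected to produce.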

Targeted analyses of the green algal (C. reinhardtii
and V. carteri) genomes revealed a single protein from
V. carteri. The V. carteri protein is located between the
non-plant clan and the remainder of the Viridiplantae
(Figure 1). This may suggest that the V. carteri gene
shares a common ancestry with other Viridiplantae
NRT1/PTRs. Although the second green alga analyzed
here (C. reinhardtii), also belonging to the Chlorophyceae,
does not contain a homolog in its annotated genome,
other green algae, including Chlorella variabilis, Coccomyxa
subellipsoidea (both belonging to the Trebouxiophyceae),
and Ostreococcus tauri (Mamiellophyceae) do
contain NRT1/PTR-like sequences with high sequence
similarity to the V. carteri gene included here (based on
sequence similarity searches against their annotated proteomes
available at the JGI).

Within the land plants, it is apparent that both NRT1s
and PTRs are polyphyletic, i.e. functionally characterized
NRT1s and PTRs are more closely related to functionally
distinct proteins than to other proteins with identical 
function (Figure 1). Predicting the ancestral function of 
the family in land plants is difficult with the relative

[Figure 2: tree image not reproduced. Legible labels: panel A; group I; non-plant clan. Legend of predicted localization: endoplasmic reticulum, Golgi apparatus, plasma membrane, cytoplasm, peroxisome.]

Figure 2 Maximum likelihood phylogenetic reconstruction of the NRT2 family. Taxonomic groups are colored as described in Figure 1. A)
Phylogram of the NRT2 family including 81 plant sequences only. Chlorophyte sequences were used to root the tree. Bootstrap values of the ML
analysis are given and corresponding bootstrap values (>75%) from a distance neighbor joining and a parsimony analysis are indicated as green
and red stars, respectively (within groups, bootstrap values >75% are displayed as stars only for all three analyses). Subcellular localization
predictions are indicated as colored boxes framing gene identifiers and transmembrane topology predictions are given as numbers to the right
of the gene identifier. A classification system is indicated as group labels to the right (E = eudicot, M = monocot). B) Unrooted phylogeny of the
NRT2 family including twelve non-plant sequences identified as best GenBank BLAST hits using representative members from each plant group.
Bootstrap values are given for central branches only up to the branch defining a group. Approximate locations of functionally characterized
members discussed in the text are indicated. For database accession numbers of the 93 protein sequences included, see Additional file 1.



scarcity of functional data available, but given that the 
homologous SLC15 family in animals transports pep- 
tides and amino acids [32], it may be assumed that this 
is the ancestral function of the NRT1/PTR family. If this
is true, a minimum of three independent NRT gain of 
functions must be assumed. On the other hand, assum- 
ing an ancestral NRT function would require a mini- 
mum of four PTR gains (or three gains plus one 
reversion back to NRT). But again, this speculative parsi- 
mony argument is based on a very small number of 
functionally characterized proteins in a large tree. 
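
The parsimony reasoning above can be made concrete with a small dynamic-programming sketch in Python. It counts the minimum number of functional changes on a tree once the ancestral function is fixed. The tree, the leaf names, and the function assignments below are invented for illustration; they do not reproduce the NRT1/PTR phylogeny or the counts of three and four given in the text.

```python
import math


def subtree_costs(node, leaf_state, alphabet):
    """Map each state to the minimum number of changes below node,
    assuming node itself carries that state (unit cost per change)."""
    if isinstance(node, str):
        return {s: 0 if s == leaf_state[node] else math.inf for s in alphabet}
    costs = dict.fromkeys(alphabet, 0)
    for child in node:
        child_costs = subtree_costs(child, leaf_state, alphabet)
        for s in alphabet:
            # Either the child keeps state s or it changes once to state t.
            costs[s] += min(child_costs[t] + (s != t) for t in alphabet)
    return costs


def min_changes(tree, leaf_state, ancestral):
    """Minimum number of changes on tree if the root has state ancestral."""
    alphabet = sorted(set(leaf_state.values()) | {ancestral})
    return subtree_costs(tree, leaf_state, alphabet)[ancestral]


# Hypothetical toy tree: two nitrate transporters scattered among four
# peptide transporters.
toy_tree = ((("nrt_a", "ptr_a"), "ptr_b"), (("nrt_b", "ptr_c"), "ptr_d"))
toy_function = {
    "nrt_a": "NRT", "nrt_b": "NRT",
    "ptr_a": "PTR", "ptr_b": "PTR", "ptr_c": "PTR", "ptr_d": "PTR",
}

changes_if_ptr = min_changes(toy_tree, toy_function, "PTR")
changes_if_nrt = min_changes(toy_tree, toy_function, "NRT")
print(changes_if_ptr, changes_if_nrt)
```

On this toy tree a PTR ancestor requires fewer changes than an NRT ancestor. As the text cautions, such counts depend heavily on how many leaves have been functionally characterized.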

Despite the polyphyletic characteristics of the family 
and the relative dearth of functionally characterized 
members, it appears possible to define groups, or even 
supergroups that share common functions. 



Supergroup A 

In supergroup A, AtNRT1-2 is quite distantly related to
other Arabidopsis NRT1s present in the adjacent clan
(supergroup B). AtNRT1-2 has above average expression
levels in floral organs and rosette leaves (Figure 5). In contrast,
Huang et al. [33] reported high levels of AtNRT1-2
expression in roots, primarily in root hairs and root epidermis.
A transient repression of AtNRT1-2 in response to
NO₃⁻ supply has been observed while AtNRT1-1 (in supergroup
B) expression increases [33]. AtNRT1-2 has NO₃⁻
transport activity but no peptide transport activity [33].
Surprisingly, AtNRT1-2 also transports
the plant hormone abscisic acid (ABA), indeed with
greater affinity than NO₃⁻, and was thus named AIT1
(ABA-Importing Transporter 1) [13]. Kanno et al. [13]

Figure 3 Maximum likelihood phylogenetic reconstruction of the AMT1 family. Taxonomic groups are colored as in Figure 1. A) Rooted
phylogram of the AMT1 family in plants only. Bootstrap values of the ML analysis are given and corresponding bootstrap values (>75%) from a
distance neighbor joining and a parsimony analysis are indicated as green and red stars, respectively (within groups, bootstrap values >75% are
displayed as stars only for all three analyses). Subcellular localization predictions, transmembrane topology predictions, and a classification system
are indicated as described in Figure 2. B) Unrooted phylogeny of the AMT1 family including 21 non-plant sequences identified as best GenBank
BLAST hits using representative members from each plant group. Bootstrap values are given for central branches only up to the branch
defining a group.



suggest ABA import activity for three additional members
of supergroup A. While AtAIT1/NRT1-2 and AtAIT2 are
located in group A-II, AtAIT3 and AtAIT4 are contained
in the sister-group A-III (Additional file 2). In total, six
groups can be distinguished within supergroup A, but the
only other functionally characterized member of supergroup
A is OsSP1 from group A-I (Figure 1). OsSP1 is
located in the plasma membrane and is needed for panicle
elongation in rice [34,35]. Both voltage-clamp and
yeast/bacteria mutant complementation failed to show
nitrate transport activity, indicating that it transports an
alternate substrate [34]. Given the ABA transporting
function of other supergroup A members, this alternative
substrate could be ABA, consistent with the sp1
phenotype showing a reduction in panicle size [34].
If this is the case, supergroup A may be an ABA transporter
clan that may have evolved from an ancestral nitrate
transporter. This speculation is congruent with
the notion that supergroup B, forming an adjacent clan
to supergroup A, contains four characterized NRTs
(AtNRT1-1, AtNRT1-3, AtNRT1-4, and MtNRT1-3).

Supergroup B 

Distinct A. thaliana NRT1s each define one of the three
groups present in supergroup B, and MtNRT1-3 shares
group B-III with AtNRT1-3 (Figure 1, Additional file 2).
AtNRT1-1 is largely constitutively expressed and encodes
a dual-affinity NO₃⁻ transporter, performing both
low-affinity and high-affinity transport, mediated by
phosphorylation [36]. AtNRT1-1 has been described as
a NO₃⁻ sensor [37]. It functions in stomatal opening
[38] and in regulation of AtNRT2-1 [39]. MtNRT1-3
is also a dual affinity transporter; it is involved in primary
root growth and NO₃⁻ sensing, and is developmentally regulated
in an N-dependent manner in roots [40]. AtNRT1-3
has high expression levels in many tissues with highest

[Figure 4: tree image not reproduced. Legible label: non-plant clan.]

Figure 4 Maximum likelihood phylogenetic reconstruction of the AMT2 family. Taxonomic groups are colored as described in Figure 1. A)
Phylogram of the AMT2 family including plant sequences only supplemented with ML bootstrap values (NJ and parsimony bootstrap values
(>75%) indicated as blue and red stars), subcellular localization predictions, topology predictions, and proposed classification system. The root of
this tree was defined through the analysis shown in B). B) Unrooted phylogeny of the AMT2 family including eleven most similar non-plant
sequences present in GenBank. Color-coding as described above. For database accession numbers of the protein sequences included, see
Additional file 1.



expression in late stage flower petals and sepals (Figure 5).
AtNRT1-3 has inducible expression only in shoots but is
repressed in roots [30]. AtNRT1-4 (defining group B-II)
has NO₃⁻ transport activity in Xenopus oocytes [41]. It has
non-inducible expression in the roots, but may be NO₃⁻
induced in shoots [30,41]. AtNRT1-4 has highest expression
levels in leaf tissues based on microarray analyses
(Figure 5), consistent with its known expression in petioles
and leaf midveins [41]. This, together with the phenotype
of the Atnrt1-4 mutant, implies an important role in NO₃⁻
redistribution and homeostasis within the plant [41]. Together,
there is strong evidence that all groups within
supergroup B contain bona fide nitrate transporters, suggesting
that supergroup B is NRT exclusive. However,
most members have not been functionally characterized,
leaving the possibility of additional functions.

Supergroup C 

Supergroup C contains two functionally characterized
members, both nitrite (NO₂⁻) transporters (NiTR; Figure 1).
AtNiTR1 from A. thaliana and CsNiTR1 from C. sativus
each define one of the two groups present within supergroup C
(Additional file 2). CsNiTR1 (group C-II) mediates NO₂⁻ efflux
when expressed in yeast, is localized to chloroplast
membranes and may load cytosolic NO₂⁻ into the chloroplast
stroma [42]. AtNiTR1 is a member of group C-I, and
Atnitr1 knockout mutants accumulate NO₂⁻ in leaves, suggesting
a similar role to CsNiTR1 [42]. Together this suggests
that the primary function of supergroup C is nitrite
rather than nitrate or peptide transport. Supergroup C
shares a common origin with supergroup D (Figure 1) and
each supergroup is supported with high bootstrap values.
However, the relative positions of the single P. patens and
the four S. moellendorffii sequences separating the supergroups
are not well-resolved, precluding placement at the
base of either supergroup C or D. The addition of more
non-vascular and basal vascular species may resolve this
part of the phylogeny.

Supergroup D 

Supergroup D highlights the polyphyletic relationship of
the N-transporting proteins present in the NRT1/PTR
family. It contains groups characterized as glucosinolate
transporters (GTR), NO₃⁻ excretion carriers (NAXT) and

[Figure 5: expression heat map not reproduced. Scale: mean centred expression levels, log2(sample/average).]

Figure 5 In silico expression profile of putative AMTs and NRTs. Microarray data were obtained from The Bio-Array Resource for Plant Biology
[81] for all family members from A. thaliana, O. sativa, and P. trichocarpa. Note that AMT3s and AMT4s are members of the AMT2 family.



NO₃⁻ transporters (NRT1), the latter being separated from
NRT1s in supergroup B by the nitrite transporters present
in supergroup C (Figure 1). The NO₃⁻ excretion (NAXT)
group D-I forms a basal clan in supergroup D. This group
contains one characterized NAXT protein, but Segonzac
et al. [31] identified six additional sequences as putative
NAXTs, all present in group D-I. NAXT1 is plasma membrane
localized in the cortex of mature roots [31]. NO₃⁻
efflux capabilities of the NAXT proteins were demonstrated
in vitro, and RNAi transgenic plants accumulate
NO₃⁻ in roots [31]. As there are no other characterized
types of transporters present in this clan, group D-I may
define a NAXT-exclusive group.

Group D-IV (Additional file 2) is one of only two groups
in the family containing two different types of transporters:
the nitrate transporter AtNRT1-9 and the glucosinolate
transporters AtGTR1 and AtGTR2 [14,43]. AtGTR1 is localized
to the vascular tissue in leaves and can transport
4-methylthiobutyl glucosinolate. AtGTR1 likely performs
a role in distributing glucosinolates within the leaf, possibly
performing an import function into glucosinolate-rich
cells adjacent to the phloem [14]. AtGTR2 transports
4-methylthiobutyl glucosinolate at 75% the rate of
AtGTR1, is localized to veins in leaves, and likely performs
a major role in apoplastic phloem-loading of
glucosinolates [14]. The two AtGTRs and AtNRT1-9 form
a closely related group of Arabidopsis paralogs within
group D-IV (Additional file 2). AtNRT1-9 has highest expression
in roots and stems (Figure 5) and has NO₃⁻ transport
activity in Xenopus oocytes [43]. It is not rapidly
induced upon NO₃⁻ supply; however, expression levels are
increased over long-term exposure to NO₃⁻. AtNRT1-9 is
plasma membrane localized and expressed in the companion
cells of phloem in roots [43]. Combined with the observation
that Atnrt1-9 knockout mutants have reduced
NO₃⁻ concentrations in phloem, this provides strong evidence
that AtNRT1-9 is responsible for phloem loading of
NO₃⁻. Interestingly, AtNRT1-9 also has minor glucosinolate
transport activity [14]. These data suggest that GTRs
have evolved recently within the Brassicaceae lineage from
AtNRT1-9 [14]. The relatively long branch length towards
AtGTR1/GTR2 may support this, although tests for signatures
of positive selection would be needed.

Within group D-III, AtNRT1-6 has highest expression
levels in seeds (Figure 5). AtNRT1-6 is not responsible
for root uptake of NO₃⁻ as it is only expressed in reproductive
tissues [24]. AtNRT1-6 confers low-affinity NO₃⁻
transport and it has been suggested that AtNRT1-6 plays
a role in transporting NO₃⁻ from maternal tissue to the
developing embryo [24]. AtNRT1-7 is also a confirmed low
affinity NO₃⁻ transporter expressed primarily in leaves, in
particular minor veins [44], and floral organs (Figure 5),
suggesting a role in phloem loading of NO₃⁻.

Supergroup E and supergroup J 

Both supergroups E and J are comparatively small and con- 
tain no functionally characterized members. Supergroup E 
contains a single copy from most species analyzed and is 
rooted by a single P. patens and two S. moellendorffii
sequences (Additional file 2) suggesting that it has an essen- 
tial function in plants. In contrast, supergroup J contains 
exclusively eudicot sequences and only seven out of 13 
eudicot species maintain members in this supergroup. It 
appears supergroup J members have evolved specialized 
functions in species in which they are maintained. 

Supergroup F and G 

A well-supported clan contains a clan comprising aU 
non-plant sequences, the green algal sequence from 
V, carteri, and several angiosperm clans. Among them, 
supergroup G is supported with strong bootstrap sup- 
port and is rooted by multiple lycopod and bryophyte 
sequences (Figure 1). The remaining angiosperm se- 
quences form multiple clans, many of which lack 
bootstrap support and are not maintained in the sub- 
phylogeny containing only plant sequences (Additional 
file 2). This prevents defining separation times and we 
therefore refrained from defining additional supergroups 
and instead combined these sequences into a single, 
paraphyletic supergroup. This supergroup F contains 
mainly PTRs but also a characterized NRT and, together 
with supergroup G NRT Is, is the clan most closely re- 
lated to the non-plant sequences included (Figure 1). 
This supergroup contains three groups, two of which
(F-I and F-III) are well supported. The third (F-II) contains
one eudicot clan and four monocot clans
(Additional file 2). This clan also contains the sole green
algal and a single lycopod sequence, and a monocot clan
adjacent to the non-plant sequences in the unrooted
phylogeny (Figure 1). However, very poor bootstrap support
separating the monocot clans precludes determining whether
they separated before or after the monocot/eudicot split,
and thus they were all grouped together into the paraphyletic
group F-II M here. At least eight gene duplications
must be assumed prior to separation of the
monocot species analyzed (Additional file 2). The two
characterized members in group F-II M are OsPTR6
and OsNRT1 from O. sativa. OsPTR6 confers peptide
transport, as it is capable of transporting the Gly-His and
Gly-His-Gly di- and tripeptides [45]. The sole characterized
member in the eudicot F-II E group is also a peptide
transporter. AtPTR2 was the first di-/tripeptide transporter
identified in Arabidopsis [46-49], with high mRNA
expression levels in germinating seed, root, and young
leaf tissues [47]. AtPTR2 antisense transgenic plants
displayed delayed flowering time and arrested seed development
[48,50]. In contrast, the third characterized
member of the F-II group encodes a NO3- transporter.
OsNRT1 is constitutively expressed in the root epidermis
and in root hairs [51]. Like most other NRT1s,
OsNRT1 is a low-affinity transporter [35,51]. Despite
the fact that OsNRT1 is more closely related to PTRs
than to other NRT1s (Figure 1), Lin et al. [51] observed
no peptide transport. This is another example of transporters
with distinct functions sharing the same group.
In contrast, group F-I (Additional file 2) appears to be a
PTR-exclusive group. AtPTR1 transports di-/tripeptides
with low selectivity, is expressed in vascular tissue
throughout the plant, and likely performs a role in long-distance
transport [49]. AtPTR5 mediates high-affinity
transport of dipeptides and likely supplies peptides to
maturing pollen, developing ovules, and seeds [52].
Interestingly, overexpression of AtPTR5 resulted in enhanced
shoot growth and increased N content [52].

Supergroup G is nested within the larger, unresolved
G/F/non-plant clan, but is separated by bryophyte sequences
and has high bootstrap support across all
three methods employed. The clan has therefore been
designated a separate supergroup. Supergroup G appears
to be NRT-exclusive (Figure 1). The defining members,
AtNRT1-5 and AtNRT1-8, have above-average expression
levels in seeds, but AtNRT1-5 also has relatively high
expression levels in roots, flowers and senescing leaves
(Figure 5). AtNRT1-5, like AtNRT1-1 (in supergroup B),
is NO3- inducible and strongly expressed in roots; however,
the response of AtNRT1-5 to NO3- supply is much
slower than that of AtNRT1-1 [53]. Both AtNRT1-5 and
AtNRT1-8 have confirmed NO3- transport activity [53,54].
AtNRT1-8 has high expression levels in xylem parenchyma
cells in the stele, and Atnrt1-8 mutants have increased
concentrations of NO3- in the xylem, suggesting
that AtNRT1-8 is responsible for removing NO3- from the
xylem [54]. Thus, AtNRT1-5 and AtNRT1-8 appear to
define a group (G-I E) responsible for movement of NO3-
within the plant. None of the other groups contain
characterized members, precluding judgment of functional
diversity within supergroup G.

Supergroup H and supergroup I 

Supergroups H and I are adjacent clans containing only
one functionally characterized member, AtPTR3 (in
supergroup I). AtPTR3 is induced in vegetative tissues
by histidine, leucine and phenylalanine and upon salt
stress. Germination frequency of ptr3 mutants was
reduced on salt-containing media [55]. AtPTR3 is also
induced upon mechanical wounding, and ptr3 mutants
have increased susceptibility to virulent pathogenic bacteria,
suggesting that AtPTR3 has a general role in stress
response [55]. Supergroup I may be PTR specific; however,
basing this hypothesis on only one functionally
characterized protein is premature. Notably, supergroup
H is one of the largest supergroups within the NRT1/PTR
family but lacks any characterized members, highlighting
the need for further functional characterizations.

In summary, it appears that within the NRT1/PTR family,
evolutionarily distinct clans define functionally distinct
groups in many, but notably not all, cases. Our classification
is largely in agreement with Plett et al. [56], who
analyzed the NRT1 family in select monocots and eudicots:
supergroup A (containing AtNRT1-2), supergroup
B (containing AtNRT1-1, AtNRT1-3, and AtNRT1-4),
supergroup D (containing AtNRT1-6 and AtNRT1-7), and
supergroup G (containing AtNRT1-5 and AtNRT1-8) were
also indicated [56]. The additional supergroups defined
here arise from the larger number of species analyzed and
from the fact that Plett et al. [56] intentionally did not
include characterized PTRs.



The NRT2 family 

The NRT2 gene family comprises one to eight family
members with 11 or 12 predicted TM domains. The
majority of NRT2 family members are predicted to be
localized to the peroxisome or cytoplasm (Figure 2). It
is important to note that localization prediction of
hydrophobic membrane-spanning proteins is particularly
challenging. Cytoplasmic localization in particular should be
interpreted with caution, as it is based on the apparent
absence of a localization signal. Based on the transmembrane
structure prediction, all these proteins are
evidently targeted to a membrane. Phylogenetic reconstructions
suggest two distinct clans of NRT2s in angiosperms,
each containing a single group proper (Figure 2).
Plett et al. [56] also noted separation of the NRT2 family
prior to the separation of monocots and eudicots. Inclusion
of bryophyte and lycopod sequences suggests that these
groups separated early in vascular plant evolution, but after
the separation of bryophytes and tracheophytes. Therefore,
we defined them as different groups, but not as different
supergroups, which would require separation prior to the
bryophyte/tracheophyte split. However, bootstrap support
for this placement is low and conflicting results were obtained
with alternative methods (distance and parsimony).
Thus, the separation time of group I and group II remains
unresolved, but clearly happened early in embryophyte or
tracheophyte evolution.

Inclusion of non-plant homologs (from the red alga
Pyropia yezoensis, the heterokont brown alga Ectocarpus
siliculosus, and ten bacterial sequences) resulted in a single
non-plant clan separated by a long branch from the
green algal and land plant sequences (Figure 2B). This is
consistent with the monophyletic origin of Viridiplantae
NRT2s shown earlier [57,58]. We thus rooted the plant-only
phylogeny (Figure 2A) with the green algal clan.

The bryophyte clade contains all P. patens NRT2s and
is basal to the vascular plant clade. In the bryophyte
clade, multiple gene duplications led to a total of eight
P. patens NRT2s. In the lycophyte clade, there has been
one duplication event leading to two copies of NRT2 in
S. moellendorffii. None of these genes have been functionally
characterized.

As only a few angiosperm species are represented
in group II (eight out of the 20 land plant species
analyzed), it appears that gene loss in this group is common.
Both monocots and eudicots are present in
group II, suggesting that, contrary to AMT1 and AMT2
(described below), the function performed by group II
NRT2s is shared across eudicots and monocots. Within
this group, AtNRT2-5 has maximal expression in senescing
leaves based on microarray data (Figure 5), but is
described as being nitrate repressible, expressed in roots
and shoots, and, contrary to root-uptake NRT2s, having
maximum expression in the absence of NO3- [30]. This
may indicate a function in remobilization of NO3- from
stored pools [59]. While not functionally characterized,
the expression profile for PtNRT2-X (group II) shows
maximal expression in male catkins. Together, this may
suggest that eudicot group II NRT2s fulfill functions in
NO3- remobilization within the plant rather than having
root uptake activities. In contrast, the rice OsNRT2-1 (in
group II M) has maximal expression in seedling roots, is
NH4+ repressible, and, as found for typical root-uptake
NRT2s, is up-regulated in response to low concentrations
of NO3- [60]. This group thus contains bona
fide NRTs that may be responsible for both root uptake
and within-plant mobilization of NO3-; however, more
functional data are required to separate these functions in
an evolutionary context.

The distinct group I contains members from all angiosperm
species analyzed. Consistent with Slot et al. [58],
who included only group I angiosperm sequences,
monophyly of the group and of the monocot and eudicot
sister clades within the group is well supported (Figure 2).
Group I E has undergone extensive gene amplification in
all angiosperms analyzed except M. truncatula, which
contains only one NRT2. Within group I E, two sets of
Arabidopsis paralogs can be identified: one comprising
AtNRT2-1, -2, and -4, and the other containing AtNRT2-3
and -6, each alongside their A. lyrata orthologs (Figure 2).
Obvious differences in expression patterns (Figure 5), both
within and between these groups, suggest different functions
of individual members. AtNRT2-1 and AtNRT2-2 are
responsible for HATS transport in roots [61,62]. AtNRT2-1
is induced upon supply with low levels of NO3- and also
directly regulates lateral root formation under N-limiting
conditions [63]. A knockout mutant of either AtNRT2-1 or
AtNRT2-2 results in a reduction of iHATS; however, a
double knockout is required to reduce the cHATS, suggesting
partially redundant functions of these paralogs [62].
Furthermore, AtNRT2-4 is over-expressed in Atnrt2-1/Atnrt2-2
double knockouts, suggesting compensation, but
over-expression of AtNRT2-4 does not restore NO3- transport
in the double mutant [59]. Thus, it is likely that
AtNRT2-4 is responsible for the re-mobilization of stored
NO3-, but it cannot replace root uptake functions of
AtNRT2-1 and -2.

AtNRT2-3 and AtNRT2-6 are less well characterized,
but AtNRT2-3 expression is slightly responsive to NO3-
supply both in shoots and roots [30], and AtNRT2-6 has
very low overall expression, with highest expression in
roots. AtNRT2-6 may be responsible for very high affinity
NO3- transport [59]. In conclusion, individual group I
E members in A. thaliana fulfill distinct physiological
functions, and even close paralogs (e.g. AtNRT2-1
and -2) have only partially redundant functions. It may
thus be expected that functional radiation within this
group has also led to the maintenance of multiple copies
with diverse functions in other species.

The monocot sister clade, group I M, has undergone
multiple gene duplications at different times in its evolutionary
history (Figure 2). The majority of group I M
proteins are predicted to be localized to the cytoplasm,
but five proteins are predicted to be localized to the
secretory pathway, one to the Golgi and four to the ER
(Figure 2). Both rice NRT2s, OsNRT2-2 and OsNRT2-3,
have maximal expression in seedling roots (Figure 5),
and OsNRT2-2 is rapidly induced in roots upon NO3-
supply, then down-regulated quickly [25]. OsNRT2-3 has
a transient inducible response to NO3-, and Yan et al.
[64] have shown that OsNRT2-1, OsNRT2-2, and
OsNRT2-3, like the NRT2s in A. thaliana, must interact
with an additional protein, OsNAR2-1, to perform transport
activity.

The AMT1 family 

The AMT1 gene family comprises 1-7 family
members with either 11 or 12 predicted TM domains.
Most AMT1 family members are predicted to be localized
to the secretory pathway, namely the endoplasmic
reticulum (ER) or Golgi apparatus (Figure 3). Phylogenetic
reconstructions suggest the existence of two evolutionarily
distinct clans of AMT1 members in angiosperms (supergroups
A and B), which are separated by sequences from
lycophytes and bryophytes (Figure 3). All P. patens sequences
and the sole S. moellendorffii sequence form a
monophyletic group at the base of supergroup A.

Two paraphyletic clans are apparent in the green algae,
and in each clan gene duplications prior to speciation
generated the copies present in V. carteri and C. reinhardtii.
No functional data are available for any of these
green algal AMT1 transporters. McDonald et al. [16]
and McDonald et al. [57] provide extensive evolutionary
analyses of the AMT1 family across multiple green algal
and other eukaryotic lineages. Both identified a single
land plant clade rooted by green algal clades. Here, we
expanded land plant coverage and identified a new divergent
land plant clan (supergroup B). To
validate that supergroup B sequences are also part of
the same land plant clade identified previously [16,57],
we used representative members of each group to identify
the ten most similar non-plant members present
in GenBank. This resulted in an overlapping set of 18
sequences from bacteria, heterokonts, cryptophytes,
amoeba, and rhodophytes (Additional file 1). Upon inclusion
of the non-plant sequences into the phylogeny,
all land plant AMT1s remained in a single clan separated
by green algal sequences from the non-plant clan
(Figure 3B), suggesting that all plant AMT1s analyzed
here were inherited vertically from a common ancestor
and are part of the single land plant clan identified by
McDonald et al. [16]. Thus, supergroup A AMT1s most
likely separated from supergroup B AMT1s prior to the
bryophyte/tracheophyte split, but after separation of
land plants and green algae.

We rooted the plant-only phylogeny (Figure 3A) with
green algal sequences. No bryophyte, lycophyte, or monocot
homologs to the eudicot B-I E members are present
in the extant species analyzed. Also, most eudicot species
analyzed here do not contain supergroup B members,
including all species within the Brassicales, suggesting
a specialized function of these ancient AMT1s in those
species that maintained them. Biochemical functional
characterization of any supergroup B AMT1s is lacking,
but PtAMT1-6 expression is increased upon ectomycorrhizal
symbiosis [6]. This may indicate that members of
supergroup B perform symbiont-related transport,
such as NH4+ uptake from mycorrhizae.

Within supergroup A, a single, well-supported monocot
clade and multiple eudicot clades are apparent. Bootstrap
support for deciphering the relationship between
these eudicot clades and the single monocot clade is
missing. Therefore it cannot be determined whether the
clades separated prior to or after the separation of monocots
and eudicots. For this reason, all eudicot sequences
within supergroup A were combined in a single group,
named A-I E. Extending taxonomic depth may be necessary
to determine if some of these clades actually define
separate groups. Most A-I AMT1s are predicted to be
ER localized, but eight proteins may be Golgi apparatus
localized and one has a peroxisome prediction. Both
clades can each be further divided into subclades that
contain members from all species in the respective
group. Subsequent duplications are apparent in both lineages,
but the group is particularly expanded in eudicots
(Figure 3), where at least three subclades exist. Additional
recent duplications gave rise to groups of paralogs in multiple
species including A. thaliana. Thus, the vast majority
of AMT1s amplified separately in eudicots and monocots,
suggesting relatively recent functional diversification.

The two rice AMT1s included belong to group A-I
M. The third rice AMT (OsAMT1-2) also groups within
this clade, but was excluded from the final analyses due
to a very long branch towards OsAMT1-2 that has the
potential to distort the overall topology of the phylogeny.
Despite this, all three encode ammonium transporters
validated by complementing an ammonium uptake-deficient
yeast strain [65]. All show maximal transcript
levels in seedling roots (Figure 5). Both OsAMT1-1 and
OsAMT1-2 are ammonium induced in N-starved plants
[65], and are repressed by transfer from low to high ammonium,
which correlates with high affinity NH4+ uptake
[29]. Kumar et al. [29] showed that OsAMT1-3 (in
a separate A-I M subclade) transcript levels remained
largely unaffected by such treatments, suggesting functional
divergence within the monocot A-I M subclades.

The paralog group in Arabidopsis contains the well-characterized
A. thaliana AtAMT1-1, AtAMT1-3 and
AtAMT1-5, and five A. lyrata orthologs. AtAMT1-1 is
expressed in the rhizoderm and root cortex [28] while
AtAMT1-2 is expressed in root endodermal cells [66].
AtAMT1-1 and AtAMT1-3 are plasma membrane localized,
and all three are high affinity NH4+ transporter
proteins involved in root uptake of NH4+ in an additive
manner [28,67]. While AtAMT1-3 and AtAMT1-5
are root specific, AtAMT1-1 is expressed more broadly,
including roots, leaves, and sepals (Figure 5). In addition
to its function in root NH4+ uptake, AtAMT1-3 has a
regulatory function in NH4+-induced lateral root branching
[68]. In contrast, the more divergent AtAMT1-4 (in a
distinct clade in group A-I E) is not expressed in roots,
but is specifically expressed in pollen. It also encodes a
plasma membrane localized high-affinity NH4+ transporter
[23]. The P. trichocarpa AMT1s sharing a clade
with AtAMT1-4 (PtAMT1-4 and PtAMT1-5) are also
expressed in male and female flowers, and, in the case of
PtAMT1-4, in leaves [69] (Figure 2). Taken together, this
suggests distinct physiological functions for A-I E subclades
in root uptake or reproductive organ supply of
ammonia.

The fifth A. thaliana AMT1 family member, AtAMT1-2,
belongs to the third A-I E clade (Figure 3) and encodes
a transporter that mainly contributes to the
HATS. It is expressed in young root endodermal cells
and more mature cortical cells, but is not induced by
low nitrogen availability [19,67]. In addition to roots,
AtAMT1-2 is also expressed in flowers and stem nodes,
with maximal expression in cauline leaves (Figure 5).
Likewise, its P. trichocarpa ortholog, PtAMT1-2, has
high levels of expression in roots [69], but also in other
tissues such as seedlings grown in continuous light and
male catkins (Figure 5). PtAMT1-2 is induced by ectomycorrhizal
symbionts together with PtAMT1-4 and
PtAMT1-6 (named PtAMT1-3 in Selle et al. [69], but
PtAMT1-6 in Couturier et al. [6] and at 'Phytozome').
The P. tremula x tremuloides ortholog of PtAMT1-2 encodes
a high affinity transporter with similar expression
patterns [69]. Together, this may suggest broader functions
for members of this clade in within-plant and
plant-symbiont ammonium distribution rather than high
affinity uptake from the soil.

In summary, it is obvious that in both eudicots and 
monocots, early gene duplication events generated the 
supergroup A AMT1 subclades. Expression profile and
physiological differences of subclade members indicate 
functional diversifications in individual species, but more 
detailed information from more species is necessary to 
generalize functional diversifications to the subgroups 
identified. This is especially true for the more divergent 
supergroup B members. 

The AMT2 family 

The AMT2 gene family comprises 1-10 family
members, with the vast majority predicted to possess 11
TM domains (Figure 4). Unrooted phylogenetic reconstructions
suggest there are two major clans of AMT2s
in angiosperms forming supergroups. All P. patens sequences
form a single bryophyte clan (Figure 4B). It appears
that many duplication events occurred in the
bryophyte lineage, both ancient and more recent, leading
to ten copies of AMT2 in P. patens. Green algal genomes
analyzed here (both belonging to the Chlorophyceae)
do not contain genes with sequence similarity to
AMT2s (Table 1). Extended sequence similarity searches
targeting the Chlorophyta returned exclusively sequences
from Mamiellales. McDonald et al. [57] focused on
these AMT2s present in Mamiellales and showed that
they do not share an immediate evolutionary history
with land plant AMT2s. Instead, land plant AMT2s
likely arose from a horizontal gene transfer (HGT)
event [16,57]. Our BLAST searches using representative
members from all groups against the GenBank database
excluding Viridiplantae revealed several bacterial species,
including the extremophile chemoautotrophic bacterium
Leptospirillum rubarum and the chemolithotroph Acidithiobacillus
caldus, as most similar sequences (Additional
file 1). McDonald et al. [57] and McDonald et al. [16]
identified the same bacterial species as intermediate
between typical Archaea AMT2s and land plant AMT2s
(referred to as MEPa in [16]). Most other bacterial genomes
lack this type of AMT2; thus it has been argued
that the land plant AMT2 likely arose through an HGT
event from a member of Archaea, possibly via a gamma
proteobacterial intermediate host [16,57]. The same host
is also the likely origin of AMT2 (MEPa) sequences from
fungi found only in the Leotiomyceta [16]. Consistently,
proteobacterial and fungal sequences were among the best
BLAST hits, and this group of non-plant sequences
formed a separate clan when included into the phylogeny
(Figure 4B). This confirms that the extended set of
plant sequences used here also belongs to the same monophyletic
land plant clade identified earlier [16,57].

Within the plant clan, angiosperm sequences form two 
clans separated by a bryophyte clan, suggesting that the 
common ancestor of bryophytes and angiosperms pos- 
sessed two AMT2 genes and that one copy was lost in 
the bryophytes. We thus placed the root of the plant 
phylogeny between supergroup B and the bryophytes 
(Figure 4A). The ancient separation of supergroups A 
and B AMT2s in angiosperms, together with the fact 
that both were maintained in all angiosperm species analyzed,
clearly suggests a functional difference. However,
in-depth characterizations of AMT2s are scarce.

Supergroup A contains three groups, one of which
(group A-III M) was maintained in monocots only. However,
group A-III M contains two distinct clades, and
bootstrap support for them forming a monophyletic
group together is poor (Figure 4A). Extending species
depth may thus split this group into two separate monocot
groups. None of the group A-III M members has been functionally
characterized. OsAMT3-3 and OsAMT3-2 (in distinct
group A-III M clades) have high expression levels in
the seedling root; however, OsAMT3-3 (A-III) is also
expressed at high levels during early seed development
(Figure 5).

Group A-I members, present in monocots and eudicots,
are largely predicted to be localized to the endomembrane
system. Recent duplications gave rise to
paralogs present in P. trichocarpa, M. esculenta, and G.
max. The remainder of the species contain only a single
type A-I AMT2 gene (Figure 4). Of the eudicot members of
group A-I, PtAMT2-1 is expressed nearly exclusively in
roots and encodes confirmed ammonium transport activity,
shown through complementation of MEP (MEthylammonium
transPorter) deficient yeast [6]. PtAMT2-2
has also been shown to have NH4+ transport activity
as well as detectable expression in roots [6] in addition
to high expression in male catkins (Figure 5). The
sole AMT2 gene in A. thaliana (AtAMT2-1) belongs to
group A-I E and has maximal expression in the stem internodes
as well as notable expression levels in leaves
and flowers, based on published microarrays (Figure 5).
Sohlenkamp et al. [70] also noted expression in roots.
AtAMT2-1 has ammonium transport activity similar to
that of AtAMT1-1 at pH 7.5, but transport capacity of
AtAMT2-1 is an order of magnitude lower than that of
AtAMT1-1 at pH 6.5 [70].



The monocot group A-I appears to have undergone an
early duplication event preceding speciation in the
monocots, leading to two sister clades (Figure 4A).
OsAMT2-1 is expressed fairly broadly in both roots and
shoots [71] (Figure 5). OsAMT2-1 has NH4+ transport
activity in yeast complementation tests, at least at high
N concentrations [71]. OsAMT2-3 appears more specifically
expressed during inflorescence development and
late stages of seed development, while OsAMT2-2 shows
high expression in the seedling root (Figure 5). OsAMT2-2
transcripts are induced upon supply of NH4+ [27] and have
maximal expression levels in seedling roots, suggesting a
role for OsAMT2-2 in NH4+ uptake from the soil.

In eudicots, there is either only a single copy A-II
AMT2 present, or type A-II sequences are absent from
eudicot genomes, as is the case in A. thaliana. Functional
characterization of group III AMT2s is limited to
PtAMT3-1, which has maximal expression in male catkins
and has virtually no expression in roots (Figure 5).
PtAMT3-1 is induced during senescence, but whether it
functions as an NH4+ transporter remains unclear, as the
gene is unable to complement MEP deficient yeast [6].
Of the monocot genes in group A-II, OsAMT3-1 has expression
in many tissues, but highest expression in
seeds. Generally, OsAMT3-1 has much lower expression
levels than OsAMT2-1 (in group A-I) both in roots
and shoots [71].

In summary, the only biochemically characterized
AMT2s reside in group A-I, and given the lack of MEP
complementation and/or the lack of functional analyses
for members from any other group, it remains to be
shown that transporters of the other groups are indeed
NH4+ transporters, or instead transport other solutes.

Conclusion 

We here provide a comprehensive evolutionary view of
ammonium, nitrate, and peptide transporter families
across a large number of land plant species. This enables
a phylogenetic classification of each family and affords a
foundation for further functional characterization. Given
the depth of species coverage, it can be assumed that
most, if not all, groups of N-transporters in angiosperms
have been defined. All four families of N-transporters
appear to be inherited vertically within the land plants,
although evolutionarily distinct, sometimes small and
lineage-specific groups are obvious, suggestive of lateral
gene transfer. These lineage-specific groups likely separated
prior to the bryophyte/tracheophyte split and were
maintained only in select species, suggesting specialized
functions. McDonald et al. [57] also suggested monophyly
of the land plant clades for all four N-transporter
families, but given the broader scope of their study, it
did not aim to resolve the evolutionary history within
land plants. However, at least two HGT events within
green algal AMT2s were evident in that study, one of
which led to the monophyletic AMT2s in land plants.

Early divergence and extensive amplification are particularly
obvious in the NRT1/PTR family. Ten supergroups
were defined that separated prior to the bryophyte/tracheophyte
split and subsequently underwent duplications, giving
rise to at least 32 groups that separated prior to the monocot/eudicot
split. This is paralleled by functional divergence
in this family, with four known substrates being
transported, namely nitrate, peptides, abscisic acid, and glucosinolates.
The most similar non-plant sequences encompass
the solute carrier family 15 (SLC15) of animals. Given
that SLC15 proteins are peptide transporters [32], it appears
plausible that the ancestral function of the plant family
was transport of organic N-containing solutes. Thus, it
is obvious that nitrate transport activity is polyphyletic and
evolved several times independently within the NRT1/PTR
family. There appear to be multiple cases in which functional
labels can be applied to groups proper within each
supergroup. However, these functional assignments can
only be tentative, given the paucity of functionally characterized
proteins relative to the abundance of sequences
analyzed, and given that NRT1s within at least two supergroups
have clearly evolved to transport distinct substrates:
N-containing glucosinolates and N-free isoprenoids, i.e.
abscisic acid [13,14].

Subcellular localization predictions largely support the
notion of functional divergence among discrete groups and
subgroups. Functional information can be inferred from
localizations; for instance, the extensive endomembrane
system (Golgi, ER, and plasma membrane) prediction in the
AMT2 family may indicate primary localization to the
plasma membrane, suggesting cellular uptake rather than
intracellular compartmentalization of ammonium. However,
some of the predicted localizations are unexpected, such as the
high degree of peroxisome localization in the NRT2 family.
This could be due to difficulties in predicting the localization
of hydrophobic, membrane-bound proteins, but also to the
immaturity of proteome annotations, many of which are based
on recently released genomes; nevertheless, distinct functions
in unexpected organelles should not be excluded.

Currently, no systematic nomenclature of the NO3- and
NH4+ transporters exists. Here, we suggest a naming system
that pertains to group membership, defined as being
derived from a single gene present in the last common an- 
cestor of monocots and eudicots. This simple rule allows 
for easy addition of future sequences to groups, and for- 
mation of new groups, should the need arise. 

Given the depth of angiosperm sequences available, we
were able to dissect this taxonomic group comprehensively.
However, it is apparent from the inclusion of
P. patens and S. moellendorffii that a similar diversity
also exists in non-seed plants, and that inclusion of additional
taxa in these groups and other taxonomic groups,
from ferns to gymnosperms, is necessary to assess the
full evolutionary history of the N-transporting systems
in all plants.

Methods 

Sequence acquisition 

Individual sequences and accession numbers from functionally
characterized NRTs and AMTs were obtained
through primary literature research [6,21,23-30]. These
protein sequences were used in BLASTP searches
against the A. thaliana, P. trichocarpa, O. sativa, and
Zea mays proteome annotations using Phytozome [72]
(http://www.phytozome.net/) and GenBank
(http://www.ncbi.nlm.nih.gov/genbank). Sequences obtained from the
initial BLAST searches were then used as query sequences
against all organisms present on Phytozome as
of January 10th, 2011. Over 1,300 sequences in total
were obtained and are summarized in Additional file 1.
Sequences that are putative transporters are given letters
(PtNRT1-A, PtNRT1-B, etc.), and sequences that are
functionally characterized to some degree retain the
name they were given in the paper in which they were
identified. The protein BLAST algorithm parameters
used were the BLOSUM62 comparison matrix, the default word
length of 3, gaps allowed (existence cost of 11 and extension
cost of 1), and a filter for low-complexity
regions. Sequences were accepted from BLAST results
as long as they were not a series of small fragments,
shared at least 30% identity, and had an expect threshold
lower than le'^^. 
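
The acceptance criteria above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline actually used: the BlastHit record, its field names, and the boolean fragmented flag (standing in for the "series of small fragments" check, whose exact definition is not given) are assumptions, and the expect-value cutoff is passed in by the caller rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical summary of one BLASTP hit (field names are assumptions)."""
    subject_id: str
    percent_identity: float  # 0-100
    evalue: float
    fragmented: bool  # True if the hit is only a series of small fragments


def accept_hit(hit: BlastHit, max_evalue: float, min_identity: float = 30.0) -> bool:
    """Return True if the hit passes all three acceptance criteria."""
    return (
        not hit.fragmented
        and hit.percent_identity >= min_identity
        and hit.evalue < max_evalue
    )


def filter_hits(hits, max_evalue, min_identity=30.0):
    """Keep only accepted hits, preserving input order."""
    return [h for h in hits if accept_hit(h, max_evalue, min_identity)]
```

Keeping the expect-value cutoff as a required argument makes the threshold explicit at every call site.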

Alignments and phylogeny construction 

The sequences for each family were aligned with
DiAlign [73] via the Mobyle Portal [74]. The DiAlign
program provides a scoring system based on local similarity
of aligned blocks that indicates the alignment quality
at each position. We excluded all positions that had a
diagonal similarity of <40%.
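
The position filter described above can be sketched as follows. This is a hedged illustration only: it assumes the per-position quality has already been extracted into a list of percentages (one value per alignment column), which is not necessarily how DiAlign reports it, and the function and argument names are hypothetical.

```python
def filter_columns(alignment, column_scores, min_score=40.0):
    """Drop alignment columns whose quality score is below min_score.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    column_scores: one score (in percent) per alignment column; assumed to be
        pre-computed from the aligner's per-position quality output.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if lengths != {len(column_scores)}:
        raise ValueError("all sequences and the score list must have equal length")
    # Indices of columns that meet the threshold (scores below 40% are excluded).
    keep = [i for i, score in enumerate(column_scores) if score >= min_score]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}
```

For example, with scores of 90, 30, 40 and 10 for a four-column alignment, only the first and third columns are retained.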

Approximate maximum-likelihood phylogenetic reconstructions
were generated using FastTree version 2.1
[75]. These phylogenetic reconstructions were generated
based on the Jones-Taylor-Thornton model; models
available in FastTree were evaluated using ProtTest [76].

Phylip's SEQBOOT was used to generate resampled
alignments, and phylogenetic reconstructions were
generated for 1,000 replicates. Bootstrap values were then
mapped to each node in the original phylogenetic
reconstruction as the fraction of resampled trees in which
that split is maintained. Supporting phylogenetic
reconstructions using distance and parsimony methods were
generated using the Phylip package [77]. Neighbor-joining
trees were generated based on distance matrices using the
Jones-Taylor-Thornton model. The resampling method was
bootstrapping and consisted of 1,000 replicates. Phylogenies
were visualized and rooted in FigTree [78] using green
algae or P. patens sequences.
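
The mapping of bootstrap values onto the original tree can be illustrated with a small sketch. It is a simplification under stated assumptions: trees are represented as nested tuples of leaf names, and support is counted for rooted clades rather than unrooted splits. The function names are hypothetical and do not correspond to any Phylip or FastTree interface.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the leaf sets of all internal nodes of a nested-tuple tree."""
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):          # a leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)                  # record this internal node
        return leaves

    walk(tree)
    return found


def bootstrap_support(reference: Tree,
                      replicates: List[Tree]) -> Dict[FrozenSet[str], float]:
    """Fraction of replicate trees that contain each reference clade."""
    if not replicates:
        raise ValueError("at least one replicate tree is required")
    replicate_clades = [clades(tree) for tree in replicates]
    return {
        clade: sum(clade in rc for rc in replicate_clades) / len(replicates)
        for clade in clades(reference)
    }
```

With 1,000 replicates, a clade recovered in 750 of the resampled trees would receive a support value of 0.75, that is, 75%.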

While analyzing initial phylograms, any sequences 
with especially long branches were investigated in the 
original alignment. If the sequence had large gaps
exceeding 30% of the alignment length, or contained
areas of extensive differences throughout the
sequence (likely indicating a gene modeling artifact), it
was excluded, the remaining sequences were realigned,
and new phylogenies were reconstructed. OsAMT1-2
was the only functionally characterized sequence in this 
category and was excluded from the phylogeny. Boot- 
strap values were reported within groups (as color coded 
stars for the three methods) if the support was higher 
than 75%. Maximum Likelihood bootstrap values were 
reported on all branches outside groups and were sup- 
plemented with colored stars, when the same topology 
was also supported by distance and/or parsimony ana- 
lyses. To gather evidence for monophyly of the families 
within plants, sequences from each group of each family 
were used as probes in additional BLAST searches 
against the GenBank database (excluding Viridiplantae 
sequences) to identify publicly available non-plant
sequences sharing sequence similarity with the baits. The
ten top BLAST hits (lowest expect value) were retained. 
These sequences were added to the respective sequence 
collection, the family was re-aligned and new phylogenies 
were reconstructed. If the non-Viridiplantae sequences 
formed a separate clan, the plant family was considered 
monophyletic. 
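
The gap criterion used to screen long-branch sequences can be sketched as below. This is one possible reading only: it assumes that the 30% threshold applies to the total fraction of gap characters in an aligned sequence, and the function names are hypothetical. The check for areas of extensive differences was a manual inspection and is not modeled here.

```python
from typing import Dict, List


def gap_fraction(aligned_seq: str, gap_chars: str = "-") -> float:
    """Fraction of positions in an aligned sequence that are gap characters."""
    if not aligned_seq:
        raise ValueError("aligned sequence is empty")
    return sum(ch in gap_chars for ch in aligned_seq) / len(aligned_seq)


def flag_gappy_sequences(alignment: Dict[str, str],
                         max_gap_fraction: float = 0.30) -> List[str]:
    """Names of sequences whose gap fraction exceeds max_gap_fraction."""
    return [name for name, seq in alignment.items()
            if gap_fraction(seq) > max_gap_fraction]
```

Sequences flagged in this way would be candidates for exclusion, after which the remaining sequences are realigned and the phylogeny is rebuilt.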

Expression profiling, transmembrane predictions, 
subcellular predictions 

Subcellular localization predictions were performed using 
MultiLoc2 [79] with the MultiLoc2-HighRes (Plant), 10 
Locations algorithm. All predictions were recorded, but 
only the highest probability prediction was reported in the 
final figures. TM domain predictions were performed
using TopCons [80] with no restrainment options
selected. The TopCons website reports the results of
several topology prediction programs in addition to the
TopCons-exclusive prediction, but only the
TopCons-exclusive prediction was recorded here. In silico
expression profiling (heatmapping) was performed using
the Bio-Analytic Resource for Plant Biology (BAR) eFP
(electronic fluorescent pictograph) browser, which is
based on re-normalized Affymetrix® microarray
expression data published previously
[81,82]. Tissue and organ gene expression data for each
gene were retrieved from the respective eFP browser site 
and compiled into a data table. This was used to generate 
heatmaps where colour coding was used to visualize ex- 
pression levels. These visualizations were performed using 
Microsoft Excel. The organisms analyzed were A. thaliana,
P. trichocarpa, and O. sativa.
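
Reporting only the highest-probability localization can be sketched as follows. The sketch assumes the predictions for one protein are available as a mapping from location name to probability; the function name and the location labels in the usage note are illustrative and are not the MultiLoc2 output format.

```python
from typing import Dict, Tuple


def top_location(probabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return the location with the highest predicted probability."""
    if not probabilities:
        raise ValueError("no predictions supplied")
    location = max(probabilities, key=probabilities.get)
    return location, probabilities[location]
```

For example, a hypothetical prediction of `{"plasma membrane": 0.62, "vacuolar membrane": 0.21}` would be reported as plasma membrane.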



Supporting data 

The data sets supporting the results of this article are 
included within the article and its additional files. 
Accession numbers and sequences of proteins included 
are given in Additional file 1. The alignment and tree 
files presented have been submitted to TreeBASE (acces- 
sion number 14948). 

Endnote 

^While this manuscript was under review, an alternative
naming and classification system of the NRT1/PTR
superfamily was proposed by Léran et al. (Trends Plant
Sci, in press, doi:10.1016/j.tplants.2013.08.008). The
'supergroups' described here and the 'clades' defined
by Léran et al. largely correspond, although
relationships between supergroups/clades lack resolution
and thus correspondence: supergroup A corresponds to
clade 4; supergroups B, E, and J together correspond to
clade 6; supergroup C corresponds to clade 3;
supergroup D corresponds to clades 1 and 2; supergroup F
corresponds largely to clade 8, but the distinct group F
II-M (defined by OsPTR6 in Figure 1) was placed into
clade 7; the remainder of clade 7 corresponds to
supergroup G; supergroups H and I together correspond to
clade 5. For ease of comparison, the names used by
Léran et al. were added to Additional file 1.

Additional files 



Additional file 1: Protein sequences included in phylogenetic 
analyses including name used, categorization applied, species of 
origin, database accessions (GenBank for non-plant sequences, 
Phytozome v6.0 for plant sequences), and protein sequence. 

Additional file 2: Maximum Likelihood phylogenetic reconstructions 
of the NRT1 family by supergroup. For each tree, sequences were
realigned and trimmed separately prior to phylogenetic reconstruction.
Groups are defined and colored as in Figure 1. Bootstrap values from 1,000 
replicates are given for branches up to those defining groups only. 
Supergroups C and D, supergroups F and G, and supergroups H and I were 
analyzed together, because they each form monophyletic clans with high 
bootstrap support (see Figure 1) but have bryophyte and lycophyte proteins 
with ambiguous relation to the two supergroups included. 